Quality-based Rewards for Monte-Carlo Tree Search Simulations

Authors

Tom Pepels

Mandy J. W. Tak

Marc Lanctot

Mark H. M. Winands

Abstract

Monte-Carlo Tree Search is a best-first search technique based on simulations to sample the state space of a decision-making problem. In games, positions are evaluated based on estimates obtained from rewards of numerous randomized play-outs. Generally, rewards from play-outs are discrete values representing the outcome of the game (loss, draw, or win), e.g., r ∈ {−1, 0, 1}, which are backpropagated from expanded leaf nodes to the root node. However, a play-out may provide additional information. In this paper, we introduce new measures for assessing the a posteriori quality of a simulation. We show that altering the rewards of play-outs based on their assessed quality improves results in six distinct two-player games and in the General Game Playing agent CADIAPLAYER. We propose two specific enhancements, the Relative Bonus and Qualitative Bonus. Both are used as control variates, a variance reduction method for statistical simulation. Relative Bonus is based on the number of moves made during a simulation and Qualitative Bonus relies on a domain-dependent assessment of the game’s terminal state. We show that the proposed enhancements, both separate and combined, lead to significant performance increases in the domains discussed.
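The idea of adjusting a play-out's reward by its length can be illustrated with a minimal sketch. The abstract does not give the formula, so everything here is an assumption made for illustration: the names `RunningStats`, `bonus`, and `relative_bonus_reward`, the sigmoid squashing function, and the constants `a` and `k` are hypothetical. The sketch standardizes a play-out's length against the running mean and standard deviation of earlier play-outs, then adds a small bounded bonus for shorter-than-average play-outs (or a penalty for longer ones) in the direction of the outcome's sign.

```python
import math


class RunningStats:
    """Online mean and sample standard deviation (Welford's algorithm)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def std(self):
        return math.sqrt(self._m2 / (self.n - 1)) if self.n > 1 else 0.0


def bonus(lam, k=1.0):
    """Squash a standardized deviation into (-1, 1); k is an assumed slope."""
    return -1.0 + 2.0 / (1.0 + math.exp(-k * lam))


def relative_bonus_reward(r, length, stats, a=0.25, k=1.0):
    """Return reward r in {-1, 0, 1} adjusted by play-out length.

    `a` (assumed) bounds the size of the adjustment. Draws and the
    no-variance case are returned unchanged.
    """
    if r == 0 or stats.std == 0.0:
        return float(r)
    # Positive when the play-out was shorter than average.
    lam = (stats.mean - length) / stats.std
    return r + math.copysign(1.0, r) * a * bonus(lam, k)


# Usage: record some play-out lengths, then adjust new rewards.
stats = RunningStats()
for observed_length in (10, 20, 30):
    stats.update(observed_length)

short_win = relative_bonus_reward(1, 10, stats)
long_win = relative_bonus_reward(1, 30, stats)
short_loss = relative_bonus_reward(-1, 10, stats)
```

In this sketch a short win is valued above a long win, and a short loss is valued below a long loss; a search would then backpropagate the adjusted value in place of the raw outcome. A Qualitative Bonus could reuse the same structure with a domain-specific terminal-state score in place of the play-out length.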

Similar resources

Deep Reinforcement Learning with Model Learning and Monte Carlo Tree Search in Minecraft

Deep reinforcement learning has been successfully applied to several visual-input tasks using model-free methods. In this paper, we propose a model-based approach that combines learning a DNN-based transition model with Monte Carlo tree search to solve a block-placing task in Minecraft. Our learned transition model predicts the next frame and the rewards one step ahead given the last four frame...

A Monte Carlo-Based Search Strategy for Dimensionality Reduction in Performance Tuning Parameters

Redundant and irrelevant features in high dimensional data increase the complexity in underlying mathematical models. It is necessary to conduct pre-processing steps that search for the most relevant features in order to reduce the dimensionality of the data. This study made use of a meta-heuristic search approach which uses lightweight random simulations to balance between the exploitation of ...

Multiple Overlapping Tiles for Contextual Monte Carlo Tree Search

Monte Carlo Tree Search is a recent algorithm that achieves more and more successes in various domains. We propose an improvement of the Monte Carlo part of the algorithm by modifying the simulations depending on the context. The modification is based on a reward function learned on a tiling of the space of Monte Carlo simulations. The tiling is done by regrouping the Monte Carlo simulations wh...

Monte-Carlo Hex

We present YOPT, a program that plays Hex using Monte-Carlo tree search. We describe heuristics that improve simulations and tree search. We also address the combination of Monte-Carlo tree search with virtual connection search.

Enhancements for Monte-Carlo Tree Search in Ms Pac-Man

In this paper enhancements for the Monte-Carlo Tree Search (MCTS) framework are investigated to play Ms Pac-Man. MCTS is used to find an optimal path for an agent at each turn, determining the move to make based on randomised simulations. Ms Pac-Man is a real-time arcade game, in which the protagonist has several independent goals but no conclusive terminal state. Unlike games such as Chess or ...

Publication date: 2014
